
Support Linstor Primary Storage for NAS BnR#12796

Open
abh1sar wants to merge 5 commits into apache:4.22 from shapeblue:linstor-nas

Conversation

Contributor

@abh1sar abh1sar commented Mar 12, 2026

Description

This PR fixes #12218

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

  1. Backup from Linstor Primary Storage to NAS NFS backup repository
  2. Restore from Backup to Linstor Primary Storage
  3. Create new Instance from Backup to Linstor Primary Storage
  4. Restore Backup to NFS Primary Storage
  5. Create new Instance from Backup to NFS Primary Storage

Validated Data after Restore.

How did you try to break this feature and the system with this change?


codecov bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 21.62162% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.60%. Comparing base (4708121) to head (0f5e9d6).

Files with missing lines Patch % Lines
...ce/wrapper/LibvirtRestoreBackupCommandWrapper.java 29.62% 34 Missing and 4 partials ⚠️
...rg/apache/cloudstack/backup/NASBackupProvider.java 0.00% 14 Missing ⚠️
...apache/cloudstack/backup/RestoreBackupCommand.java 0.00% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #12796      +/-   ##
============================================
- Coverage     17.60%   17.60%   -0.01%     
  Complexity    15677    15677              
============================================
  Files          5918     5918              
  Lines        531681   531727      +46     
  Branches      65005    65016      +11     
============================================
- Hits          93623    93620       -3     
- Misses       427498   427545      +47     
- Partials      10560    10562       +2     
Flag Coverage Δ
uitests 3.70% <ø> (ø)
unittests 18.67% <21.62%> (-0.01%) ⬇️


```java
volumePathPrefix = storagePool.getPath();
volumePathPrefix = storagePool.getPath() + "/";
} else if (Storage.StoragePoolType.Linstor.equals(storagePool.getPoolType())) {
    volumePathPrefix = "/dev/drbd/by-res/cs-";
```
Contributor


can keep this path in LinstorUtil, and use it here

Contributor Author


Can't access LinstorUtil from here.

@sureshanaparti sureshanaparti linked an issue Mar 12, 2026 that may be closed by this pull request
@sureshanaparti sureshanaparti requested a review from rp- March 12, 2026 07:40
@sureshanaparti sureshanaparti added this to the 4.22.1 milestone Mar 12, 2026
@sureshanaparti
Contributor

Hi @rp-, can you perform some tests with this PR's changes and confirm?

@abh1sar
Contributor Author

abh1sar commented Mar 14, 2026

@blueorangutan package

@blueorangutan

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17143

@rp-
Contributor

rp- commented Mar 16, 2026

Hi @rp- Can you perform some tests with this PR changes and confirm?

I created a test cluster and also tested backup to NFS.
Everything worked as expected. I only noticed that the backup .qcow2 file took the FULL space, so it looks like no zero-discard is happening. Is there a way to tell qemu/libvirt to use zero-discard while creating the backup image?

Or would it be possible to later run qemu-img convert -c or virt-sparsify on the backup?
By the way, virt-sparsify only worked without the --in-place option, but then the savings are huge: 414M vs. 8GB; qemu-img compression gives 527M vs. 8GB.

Maybe also @sbrueseke wants to check this out

@sbrueseke

sbrueseke commented Mar 17, 2026

As far as I understand the backup framework, it uses virsh for running instances and qemu-img for stopped instances. Is this correct? If so, for stopped instances adding a -c parameter would be easy and would make sense to me. But virsh does not have a parameter for compression, at least not one that I am aware of.
Compressing the backup data is a goal we should achieve.

@rp-
Contributor

rp- commented Mar 17, 2026

As far as I understand the backup framework, it uses virsh for running instances and qemu-img for stopped instances. Is this correct? So for stopped instances adding a -c parameter would be easy and would make sense to me. But virsh does not have a parameter for compression, at least not that I am aware of. Compressing the backup data is a goal we should achieve.

@sbrueseke 😆 Thanks for the thoughts, but I actually just wanted to make you aware of this PR in case you want to test it.

abh1sar added 3 commits April 5, 2026 17:54
Co-authored-by: Abhisar Sinha <63767682+abh1sar@users.noreply.github.com>
@abh1sar
Contributor Author

abh1sar commented Apr 5, 2026

@rp- is this PR good to go from your side?
There is another open PR, #12898, which adds compression support for NAS backup and will handle the backup file bloat.

@abh1sar
Contributor Author

abh1sar commented Apr 5, 2026

@blueorangutan package

@blueorangutan

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@abh1sar abh1sar requested a review from sureshanaparti April 5, 2026 12:26
@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17360

@abh1sar
Contributor Author

abh1sar commented Apr 6, 2026

@blueorangutan test

@blueorangutan

@abh1sar a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-15811)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 51470 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr12796-t15811-kvm-ol8.zip
Smoke tests completed. 149 look OK, 0 have errors, 0 did not run
Only failed and skipped test results are shown below; there were none in this run.

Contributor

@rp- rp- left a comment


Except for the large backup .qcow2 I have no further objections.

@sureshanaparti
Contributor

Except for the large backup .qcow2 I have no further objections.

@rp- @abh1sar any doc update is required for large backups?

Contributor

@sureshanaparti sureshanaparti left a comment


Code LGTM.

@abh1sar
Contributor Author

abh1sar commented Apr 7, 2026

Except for the large backup .qcow2 I have no further objections.

I did some testing and found that qemu-img convert -O, which is used for stopped-VM backups, doesn't bloat the space.
So for running VMs I added code in nasbackup.sh to run qemu-img convert -O on the backup file, converting into a temp file and then replacing the backup file with it.

There is no other way to avoid this with virsh backup-begin, so our best bet is to release the storage space after the backup file is created.

@rp- would you be able to verify the fix so that we can merge this in 4.22.1 before the freeze this Friday? Thanks.
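The temp-file compaction step described in this comment could be sketched as follows. This is an illustrative assumption, not the PR's actual nasbackup.sh code; the function name and exact invocation are made up for the example.

```shell
#!/bin/bash
# Illustrative sketch (not the PR's actual nasbackup.sh code): rewrite the
# backup qcow2 through `qemu-img convert` into a temp file, then replace the
# original, so zero-filled clusters allocated during `virsh backup-begin`
# are dropped from the on-disk file.
set -eu

compact_backup() {
    local backup="$1"
    local tmp="${backup}.tmp"
    # `qemu-img convert -O qcow2` rewrites the image and skips zero clusters;
    # only replace the original if the conversion succeeded.
    qemu-img convert -O qcow2 "$backup" "$tmp" && mv -f "$tmp" "$backup"
}
```

Because the original file is replaced only after a successful convert, an interrupted compaction leaves the backup intact.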

@abh1sar
Contributor Author

abh1sar commented Apr 7, 2026

@blueorangutan package

Contributor

Copilot AI left a comment


Pull request overview

This PR addresses issue #12218 by aligning NAS backup/restore behavior with how Linstor-backed volumes are identified and addressed on KVM hosts, enabling successful restore / create-from-backup workflows when the target primary storage is Linstor.

Changes:

  • Update KVM NAS backup script to name backup qcow2 files using the Linstor volume UUID (derived from /dev/drbd/by-res/cs-<uuid>/0) and add a post-backup sparsification step for Linstor-backed disks.
  • Extend KVM restore wrapper logic to correctly derive volume UUIDs for Linstor paths, restore to Linstor block devices, and attach Linstor volumes as RAW disks.
  • Add restoreVolumeSizes to RestoreBackupCommand and plumb it through NAS restore flows (with test updates to satisfy the new command shape).
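The UUID derivation mentioned above, from a device path of the form /dev/drbd/by-res/cs-&lt;uuid&gt;/0, can be sketched in shell. The function name is an illustrative assumption, not the PR's actual code:

```shell
#!/bin/bash
# Illustrative sketch: pull the CloudStack volume UUID out of a Linstor/DRBD
# device path of the form /dev/drbd/by-res/cs-<uuid>/<volume-number>.
extract_linstor_uuid() {
    local dev_path="$1"
    local rest="${dev_path#/dev/drbd/by-res/cs-}"  # strip the fixed prefix
    printf '%s\n' "${rest%%/*}"                    # drop the /<volume-number> suffix
}
```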

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Summary per file:
  • scripts/vm/hypervisor/kvm/nasbackup.sh: Adjusts backup naming for Linstor block devices and adds a Linstor qcow2 re-sparsification step.
  • plugins/backup/nas/src/main/java/org/apache/cloudstack/backup/NASBackupProvider.java: Builds correct restore paths for Linstor (/dev/drbd/by-res/cs-<id>/0) and sends restore volume sizes.
  • core/src/main/java/org/apache/cloudstack/backup/RestoreBackupCommand.java: Adds restoreVolumeSizes field to carry target volume sizing info to the agent.
  • plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtRestoreBackupCommandWrapper.java: Adds Linstor-aware UUID extraction, block-device restore handling, and RAW attach for Linstor.
  • plugins/hypervisors/kvm/src/test/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtRestoreBackupCommandWrapperTest.java: Updates mocks for new command fields/pool type requirements.


@abh1sar
Contributor Author

abh1sar commented Apr 7, 2026

@blueorangutan package

@blueorangutan

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17385

@rp-
Contributor

rp- commented Apr 7, 2026

Except for the large backup .qcow2 I have no further objections.

I did some testing and found that qemu-img convert -O, which is used for stopped-VM backups, doesn't bloat the space. So for running VMs I added code in nasbackup.sh to run qemu-img convert -O on the backup file, converting into a temp file and then replacing the backup file with it.

There is no other way to avoid this with virsh backup-begin, so our best bet is to release the storage space after the backup file is created.

@rp- would you be able to verify the fix so that we can merge this in 4.22.1 before the freeze this Friday? Thanks.

Just tested it now and it looks good.
Online backups get compacted afterwards.



Development

Successfully merging this pull request may close these issues.

Create Instance from Backup Failed
